Symmetric Distortion Measure for Speaker Recognition

Authors

  • Evgeny Karpov
  • Tomi Kinnunen
  • Pasi Fränti
Abstract

We consider matching functions in vector quantization (VQ) based speaker recognition systems. In VQ-based systems, a speaker model consists of a small collection of representative vectors, and matching is performed by computing a dissimilarity value between the unknown speaker’s feature vectors and the speaker models. Typically, the average or total quantization error is used as the dissimilarity measure. However, this measure lacks the symmetry property required of a proper distance measure. This is counterintuitive because the match score between speakers X and Y differs from the match score between Y and X. Furthermore, the distortion measure can yield a zero value (perfect match) for non-identical vector sets, which is undesirable. In this study, we investigate ways of making the quantization distortion functions proper distance measures. The study includes a discussion of the theoretical properties of different measures, as well as an evaluation on a subset of the NIST99 speaker recognition evaluation corpus.
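
The two problems named in the abstract can be reproduced on toy data. The sketch below is illustrative only: the function names, the squared Euclidean distance, and the max-based symmetrization are assumptions made for this example, and the abstract does not state which symmetric measures the paper actually proposes.

```python
def avg_quantization_distortion(X, C):
    """Average squared Euclidean distance from each vector in X
    to its nearest vector in C (directed, hence asymmetric)."""
    total = 0.0
    for x in X:
        total += min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in C)
    return total / len(X)


def symmetric_distortion(X, Y):
    """One possible symmetrization (an assumption for illustration):
    the larger of the two directed distortions."""
    return max(avg_quantization_distortion(X, Y),
               avg_quantization_distortion(Y, X))


# Toy vector sets: X is a proper subset of Y, so the sets are not identical.
X = [(0.0, 0.0), (1.0, 0.0)]
Y = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]

d_xy = avg_quantization_distortion(X, Y)  # every x has an exact match in Y
d_yx = avg_quantization_distortion(Y, X)  # (5, 5) has no close match in X
d_sym = symmetric_distortion(X, Y)

print(d_xy, d_yx, d_sym)
```

Here the directed distortion from X to Y is zero even though the sets differ, while the distortion from Y to X is not. Taking the maximum of the two directions gives the same value in either order, and it is zero only when every vector of each set also appears in the other.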

Similar resources

On the use of asymmetric-shaped tapers for speaker verification using i-vectors

This paper presents asymmetric-shaped tapers (or windows) for speaker recognition. Symmetric tapers (e.g., Hamming), which have the linear phase property and a longer time delay, are widely used for short-time analysis of speech signals. Since human speech perception is relatively insensitive to short-time phase distortion, the linearity constraint on phase can be removed without any adverse effects....

Class-Discriminative Weighted Distortion Measure for VQ-based Speaker Identification

We consider the distortion measure in a vector quantization based speaker identification system. The model of a speaker is a codebook generated from the set of feature vectors from the speaker's voice sample. The matching is performed by evaluating the distortions between the unknown speech sample and the models in the speaker database. In this paper, we introduce a weighted distortion measure tha...

Using second order statistics for text independent speaker verification

This paper describes a computationally simple method to perform text-independent speaker verification using second order statistics. The suggested method, called Utterance Level Scoring (ULS), yields a normalized score in a single pass through the frames of the tested utterance. The utterance sample covariance is first calculated and then compared to the speaker covariance using a ...

Text-independent speaker verification using utterance level scoring and covariance modeling

This paper describes a computationally simple method to perform text-independent speaker verification using second order statistics. The suggested method, called utterance level scoring (ULS), yields a normalized score in a single pass through the frames of the tested utterance. The utterance sample covariance is first calculated and then compared to the speaker covariance using a ...

Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing

Currently, one of the most widely used distance measures in speech and speaker recognition is the Euclidean distance between mel frequency cepstral coefficients (MFCC). MFCCs are based on a filter bank algorithm whose filters are equally spaced on a perceptually motivated mel frequency scale. The value of the mel cepstral vector, as well as the properties of the corresponding cepstral distance, are d...

Publication date: 2004